17 research outputs found

    Discovery of large genomic inversions using long range information.

    Background: Although many algorithms are now available that aim to characterize different classes of structural variation, discovery of balanced rearrangements such as inversions remains an open problem. This is mainly due to the fact that breakpoints of such events typically lie within segmental duplications or common repeats, which reduces the mappability of short reads. The algorithms developed within the 1000 Genomes Project to identify inversions are limited to relatively short inversions, and there are currently no available algorithms to discover large inversions using high throughput sequencing technologies.
    Results: Here we propose a novel algorithm, VALOR, to discover large inversions using new sequencing methods that provide long range information such as 10X Genomics linked-read sequencing, pooled clone sequencing, or other similar technologies that we commonly refer to as long range sequencing. We demonstrate the utility of VALOR using both pooled clone sequencing and 10X Genomics linked-read sequencing generated from the genome of an individual from the HapMap project (NA12878). We also provide a comprehensive comparison of VALOR against several state-of-the-art structural variation discovery algorithms that use whole genome shotgun sequencing data.
    Conclusions: In this paper, we show that VALOR is able to accurately discover all previously identified and experimentally validated large inversions in the same genome with a low false discovery rate. Using VALOR, we also predicted a novel inversion, which we validated using fluorescent in situ hybridization. VALOR is available at https://github.com/BilkentCompGen/VALOR

    Organization and evolution of Gorilla centromeric DNA from old strategies to new approaches

    The centromere/kinetochore interaction is responsible for the pairing and segregation of replicated chromosomes in eukaryotes. Centromere DNA is portrayed as scarcely conserved, repetitive in nature, quickly evolving and protein-binding competent. Among primates, the major class of centromeric DNA is the pancentromeric α-satellite, made of arrays of 171 bp monomers, repeated in a head-to-tail pattern. α-satellite sequences can either form tandem heterogeneous monomeric arrays or assemble in higher-order repeats (HORs). Gorilla centromere DNA has barely been characterized, and data are mainly based on hybridizations of human alphoid sequences. We isolated and finely characterized gorilla α-satellite sequences and revealed relevant structure and chromosomal distribution similarities with other great apes as well as gorilla-specific features, such as the uniquely octameric structure of the suprachromosomal family-2 (SF2). We demonstrated for the first time the orthologous localization of alphoid suprachromosomal families-1 and -2 (SF1 and SF2) between human and gorilla in contrast to chimpanzee centromeres. Finally, the discovery of a new 189 bp monomer type in gorilla centromeres offers clues to the role of the centromere protein B, paving the way to understanding the significance of the centromere DNA's essential repetitive nature in association with its function and the peculiar evolution of the α-satellite sequence.

    Eight million years of maintained heterozygosity in chromosome homologs of cercopithecine monkeys

    In the Cercopithecini ancestor two chromosomes, homologous to human chromosomes 20 and 21, fused to form the Cercopithecini specific 20/21 association. In some individuals from the genus Cercopithecus, this association was shown to be polymorphic for the position of the centromere, suggesting centromere repositioning events. We set out to test this hypothesis by defining the evolutionary history of the 20/21 association in four Cercopithecini species from three different genera. The marker order of the various 20/21 associations was established using molecular cytogenetic techniques, including an array of more than 100 BACs. We discovered that five different forms of the 20/21 association were present in the four studied Cercopithecini species. Remarkably, in the two Cercopithecus species, we found individuals in which one homolog conserved the ancestral condition, but the other homolog was highly rearranged. The phylogenetic analysis showed that the heterozygosity in these two species originated about 8 million years ago and was maintained for this entire arc of time, surviving multiple speciation events. Our report is a remarkable extension of Dobzhansky’s pioneering observation in Drosophila concerning the maintenance of chromosomal heterozygosity due to selective advantage. Dobzhansky’s hypothesis recently received strong support in a series of detailed reports on the fruit fly genome. Our findings are the first extension of an analogous situation to primates, indeed to Old World monkeys phylogenetically close to humans. Our results have important implications for hypotheses on how chromosome rearrangements, selection, and speciation are related.

    Genomics technologies to study structural variations in the grapevine genome

    Grapevine is one of the most important crop plants in the world. The recent expansion of genomic resources for the grapevine genome has supported growing efforts in molecular breeding. Current cultivars display a great level of inter-specific differentiation that needs to be investigated to reach a comprehensive understanding of the genetic basis of phenotypic differences, and to find the responsible genes selected by cross-breeding programs. While there have been significant advances in resolving the pattern and nature of single nucleotide polymorphisms (SNPs) in plant genomes, few data are available on copy number variation (CNV). Furthermore, association between structural variations and phenotypes has been described in only a few cases. We combined high-throughput biotechnologies and bioinformatics tools to reveal the first inter-varietal atlas of structural variation (SV) for the grapevine genome. We sequenced and compared four table grape cultivars with the Pinot noir inbred line PN40024 genome as the reference. We found that roughly 8% of the grapevine genome is affected by genomic variations. Taking into account the phenotypic differences among the studied varieties, we compared SVs among them and the reference, and then performed an in-depth analysis of the gene content of polymorphic regions. This allowed us to identify genes showing differences in copy number as putative functional candidates for important traits in grapevine cultivation.

    Genomic technologies uncover inter-varietal structural variation in grapevine

    Grapevine (Vitis vinifera L.) is one of the most important crop plants in the world because of its economically valuable role in fruit and wine production; for this reason, great interest has recently been shown in identifying genomic variations and their functional effects on inter-varietal phenotypic differences. Discovery and characterization of all genetic variations is critical to reach a full understanding of the genetic basis of phenotypic differences. Using an approach developed for the analysis of human and mammalian genomes, which combines high-throughput sequencing (HTS), array comparative genomic hybridization (aCGH), fluorescent in situ hybridization (FISH), and quantitative PCR (qPCR), we were able to create an inter-varietal atlas of structural variations and single nucleotide variants (SNVs) for the grapevine genome analyzing four economically and genetically relevant table grapevine varieties. We found 4.8 million SNVs and detected roughly 8% of the grapevine genome affected by genomic variations. We identified more than 700 copy number variation (CNV) regions, and searching for gene content in these regions, we were able to find more than 2000 genes subjected to CNV as potential candidates for phenotypic differences between varieties. For example, we found a polymorphic region at chromosome 18_random containing six different genes belonging to the germacrene D synthase family and a flavonol synthase gene, which showed the highest rate of duplication in the Italia cultivar. Since Italia is the only cultivar (among those analyzed in this study) showing aromatic taste, this finding could be related to berry flavor trait. Likewise, comparison among the different grape genomes showed differences in gene dosages playing critical roles in response to biotic and abiotic stresses. 
Overall, our data highlight the significance of these genome-wide studies on CNVs in the grapevine genome and identify candidate genes for some of the most complex and desired traits in breeding. We paired-end sequenced four table grape cultivars, Autumn royal (AR), Italia (It), Red globe (RG), and Thompson seedless (TS), and aligned the obtained reads against the PN40024 Pinot noir reference genome (Table 1 and Figure 1). SNP identification and WSSD analysis: We identified 4,478,098 SNPs and 262,395 INDELs. Similar percentages of duplication (average 16%) and deletion (average 3%) were found in the four table grape varieties and in the PN40024 reference.
